Skip to content

Do not exceed ES _id length limit#905

Merged
yurishkuro merged 7 commits intojaegertracing:masterfrom
tmszdmsk:779-ES-autogenerate-id
Jul 10, 2018
Merged

Do not exceed ES _id length limit#905
yurishkuro merged 7 commits intojaegertracing:masterfrom
tmszdmsk:779-ES-autogenerate-id

Conversation

@tmszdmsk
Copy link
Copy Markdown
Contributor

@tmszdmsk tmszdmsk commented Jul 2, 2018

Which problem is this PR solving?

Short description of the changes

  • let's store a hash of the serviceId so that we don't exceed length limit of _id

Signed-off-by: Tomasz Adamski <tomasz.adamski@gmail.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Jul 2, 2018

Codecov Report

Merging #905 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master   #905   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files         126    126           
  Lines        6070   6074    +4     
=====================================
+ Hits         6070   6074    +4
Impacted Files Coverage Δ
plugin/storage/es/spanstore/service_operation.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/writer.go 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48bbbae...8e909f5. Read the comment docs.

@tmszdmsk tmszdmsk changed the title leave document _id generation to Eleasticsearch don't exceed ES _id length limit Jul 4, 2018
@harnash harnash force-pushed the 779-ES-autogenerate-id branch 2 times, most recently from 44ffbdf to af89809 Compare July 4, 2018 08:51
Instead of using arbitrary string (which length can be quite big) as
document _id we compute sha256 from it instead.

Signed-off-by: Łukasz Harasimowicz <dev@harnash.eu>
@harnash harnash force-pushed the 779-ES-autogenerate-id branch from af89809 to 3489723 Compare July 4, 2018 09:23

import (
"context"
"crypto/sha256"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no need for crypto hash, see

jaeger/model/hash.go

Lines 30 to 36 in d16de31

func HashCode(o Hashable) (uint64, error) {
h := fnv.New64a()
if err := o.Hash(h); err != nil {
return 0, err
}
return h.Sum64(), nil
}

}
serviceID := fmt.Sprintf("%s|%s", service.ServiceName, service.OperationName)
hashedID := fmt.Sprintf("%x", sha256.Sum256([]byte(serviceID)))
cacheKey := fmt.Sprintf("%s:%s", indexName, serviceID)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we cannot use the hash for both document ID and cacheKey. Let's just have a single value.

Signed-off-by: Łukasz Harasimowicz <dev@harnash.eu>
@harnash harnash force-pushed the 779-ES-autogenerate-id branch from c007a8e to 10fcf67 Compare July 5, 2018 08:50
@harnash
Copy link
Copy Markdown

harnash commented Jul 5, 2018

@yurishkuro thanks for the comments. I hopefully addressed them all in 10fcf67

OperationName: "operation",
}

serviceHash, err := model.HashCode(service)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this introduces unnecessary public method to Service type. Instead I would simply add something like this:

func (s Service) hashCode() string {
    h := fnv.New64a()
    h.Write(s.ServiceName)
    h.Write(s. OperationName)
    return fmt.Sprintf("%x", h.Sum64())
}

}

func (s Service) hashCode() string {
h := fnv.New64a()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be cached?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@black-adder It could be but it will introduce issue with concurrency. Every simultaneous call to this function would potentially cause race condition and would render unpredictable hash results. I would need to introduce some kind of semaphore and mark this part as critical section. Do you think it is worth?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unpredictable hash results -> shouldn't this always produce the same hash results for the same key even if run concurrently? I actually dont know the inner workings of the implementation so maybe completely wrong, we can skip for now but can you add a TODO to investigate in the future (unless you already know that there will be race conditions).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@black-adder so my reasoning is like this.

  1. You store cached hasher object somewhere (most likely in the writer.go) and use it whenever someone calls hashCode()
  2. First process (1) calls hashCode(), fetches (creates if not cached) hasher object
  3. (1) Pushes data: OperationName and ServiceName to the hasher
  4. Second process (2) calls hashCode() which fetches the same (cached) hasher object
  5. (2) pushes its OperationName and ServiceName to the same hasher
  6. (1) calculates checksum by calling Sum64() on the same hasher (which now holds data from two possibly different instances of Service)
  7. (2) does the calculations too and gets the same (wrong) hash as (1)

I did actually run into this issue while fixing the tests :-)

Even if I introduce hash.Reset() just after Sum64() or before data is pushed it the hasher I still have no guarantees that other process is not doing some operation on the cached object. Putting Write semaphore would probably do the trick but I think it is a bit overkill in this case.

I hope this helps.

indexService.On("Index", stringMatcher(indexName)).Return(indexService)
indexService.On("Type", stringMatcher(serviceType)).Return(indexService)
indexService.On("Id", stringMatcher("service|operation")).Return(indexService)
indexService.On("Id", stringMatcher(serviceHash)).Return(indexService)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of using the same hash function here to test that the service was hashed, I'd prefer if you generated this hash once and compare against the actual hash string.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed calls to hashCode() and put static hash in those tests.

harnash and others added 2 commits July 10, 2018 11:00
@yurishkuro yurishkuro merged commit 9ffc2fc into jaegertracing:master Jul 10, 2018
@yurishkuro yurishkuro changed the title don't exceed ES _id length limit Do not exceed ES _id length limit Jul 10, 2018
@tmszdmsk tmszdmsk deleted the 779-ES-autogenerate-id branch July 10, 2018 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants